Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers
Abstract
This paper addresses automatic extraction of key sentences from academic presentation speeches. The method exploits the characteristic expressions used in the initial utterances of sections, which are defined as discourse markers and derived in a totally unsupervised manner from word statistics. The statistics of the discourse markers are then used to define the importance of sentences. This measure is also combined with the conventional tf-idf measure of content words. A comprehensive evaluation using the Corpus of Spontaneous Japanese and a variety of experimental setups is presented. The evaluation scheme was carefully designed to allow comparison with human performance. The proposed method using the discourse markers is consistently effective in key sentence extraction. Based on the resulting indexing, efficient browsing of lecture audio archives is realized.
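The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' actual measure. It assumes that section boundaries and tokenized sentences are already available, scores each word by a smoothed log ratio of its frequency in section-initial sentences versus all other sentences, and mixes that score with a plain tf-idf score. The function names, the smoothing, and the weight `alpha` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def marker_scores(sections):
    """Score words by how strongly they prefer section-initial sentences.

    sections: list of sections, each a list of tokenized sentences.
    Assumption: add-one smoothed log ratio of relative frequencies
    (first sentence of a section vs. every other sentence).
    """
    first, rest = Counter(), Counter()
    for section in sections:
        for i, sentence in enumerate(section):
            (first if i == 0 else rest).update(sentence)
    vocab = set(first) | set(rest)
    n_first = sum(first.values()) + len(vocab)
    n_rest = sum(rest.values()) + len(vocab)
    return {
        w: math.log(((first[w] + 1) / n_first) / ((rest[w] + 1) / n_rest))
        for w in vocab
    }


def idf_scores(sections):
    """Inverse document frequency, treating each section as one document."""
    n = len(sections)
    df = Counter()
    for section in sections:
        df.update({w for sentence in section for w in sentence})
    return {w: math.log(n / df[w]) for w in df}


def sentence_importance(sentence, markers, idf, alpha=0.5):
    """Weighted mix of a discourse-marker score and a tf-idf score.

    alpha is a hypothetical interpolation weight; only positive marker
    scores count, so ordinary words do not penalize a sentence.
    """
    if not sentence:
        return 0.0
    marker_part = sum(max(markers.get(w, 0.0), 0.0) for w in sentence) / len(sentence)
    tf = Counter(sentence)
    tfidf_part = sum(tf[w] * idf.get(w, 0.0) for w in tf) / len(sentence)
    return alpha * marker_part + (1 - alpha) * tfidf_part
```

Ranking all sentences of a talk by `sentence_importance` and keeping the top few would give a simple key-sentence index under these assumptions. In practice the tf-idf part would be restricted to content words, as the abstract states.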
Similar resources
Sentence Extraction by Spreading Activation with Refined Similarity Measure
Although there has been a great deal of research on automatic summarization, most methods are based on a statistical approach, disregarding relationships between extracted textual segments. To ensure sentence connectivity, we propose a novel method to extract a set of comprehensible sentences that centers on several key points. This method generates a similarity network from documents with a le...
Comparison of Different Machine Learning Methods for Extractive Speech-to-Speech Summarization of Persian without Using Transcripts
In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of speech summarization deals with extracting important and salient segments from speech in order to access, search, extract and browse speech files more easily and at lower cost. In this paper, a new method for speech summarization without using automatic speech recognitio...
How Does Explicit and Implicit Instruction of Formal Meta-discourse Markers Affect Learners’ Oral Proficiency?
Meta-discourse markers are an inevitable part of oral proficiency which improve both the quality and comprehension of learners’ speech. While studies of oral meta-discourse have been conducted since the 1980s in a European or US context, they have remained relatively untouched in Iran. Therefore, this study aimed to seek the impact of both explicit and implicit teaching of formal meta-discourse...
The Analysis of the Discourse Markers in the Narratives Elicited from Persian-speaking Children
Discourse markers (DMs) are linguistic elements that index different relations and coherence between units of talk. Most research on the development of these forms has focused on conversations rather than narratives. This article examines age and medium effects on use of various discourse markers in pre-school children. Fifteen normal Iranian monolingual children, male and female, participated ...
Two-stage Automatic Speech Summarization by Sentence Extraction and Compaction
This paper proposes a new automatic speech summarization method having two stages: important sentence extraction and sentence compaction. Relatively important sentences are extracted from the results of large-vocabulary continuous speech recognition (LVCSR) based on the amount of information and the confidence measures of constituent words. The set of extracted sentences is compressed by our se...
Publication date: 2004